Axiomatizations for downward XPath on Data Trees
We give sound and complete axiomatizations for XPath with data tests by
"equality" or "inequality", and containing the single "child" axis. This
data-aware logic is interpreted over data trees, which are tree-like structures in which
every node contains a label from a finite alphabet and a data value from an
infinite domain. The language allows us to compare data values of two nodes but
cannot access the data values themselves (i.e. there is no comparison by
constants). Our axioms are in the style of equational logic, extending the
axiomatization of data-oblivious XPath, by B. ten Cate, T. Litak and M. Marx.
We axiomatize the full logic with tests by "equality" and "inequality", and
also a simpler fragment with "equality" tests only. Our axiomatizations apply
both to node expressions and path expressions. The proof of completeness relies
on a novel normal form theorem for XPath with data tests.
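To make the data tests described in this abstract concrete, here is a minimal sketch of a data tree and of the two tests over the child axis. It is an illustration only, not the paper's formalism: the names `Node`, `reach`, `eq_test` and `neq_test` are hypothetical, and path expressions are simplified to sequences of labels followed along child steps.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """A data-tree node: a label from a finite alphabet plus a data value."""
    label: str
    data: int
    children: List["Node"] = field(default_factory=list)


def reach(node: Node, path: List[str]) -> Set[int]:
    """Data values of the nodes reached from `node` by child steps
    whose labels match `path` in order (simplified path expressions)."""
    frontier = [node]
    for lab in path:
        frontier = [c for n in frontier for c in n.children if c.label == lab]
    return {n.data for n in frontier}


def eq_test(node: Node, alpha: List[str], beta: List[str]) -> bool:
    """<alpha = beta>: some alpha-reachable and some beta-reachable node
    carry the same data value."""
    return bool(reach(node, alpha) & reach(node, beta))


def neq_test(node: Node, alpha: List[str], beta: List[str]) -> bool:
    """<alpha != beta>: some alpha-reachable and some beta-reachable node
    carry different data values (not the negation of eq_test)."""
    a, b = reach(node, alpha), reach(node, beta)
    return any(x != y for x in a for y in b)


# Root labelled "a" with three children; data values are only compared,
# never inspected against constants.
root = Node("a", 1, [Node("b", 2), Node("b", 3), Node("c", 2)])
print(eq_test(root, ["b"], ["c"]))   # True: a b-child and the c-child share value 2
print(neq_test(root, ["b"], ["c"]))  # True: the b-child with value 3 differs from the c-child
```

Both tests can hold at the same node, which is why the abstract treats the "equality" and "inequality" tests as separate operators.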
Bisimulations on data graphs
Bisimulation provides structural conditions that characterize when nodes of a labeled graph are indistinguishable to an external observer. It is a fundamental notion used in many areas, such as verification, graph-structured databases, and constraint satisfaction. However, several current applications use graphs where nodes also contain data (the so-called “data graphs”), and where observers can test for equality or inequality of data values (e.g., asking the attribute ‘name’ of a node to be different from that of all its neighbors). The present work constitutes a first investigation of “data aware” bisimulations on data graphs. We study the problem of computing such bisimulations, based on the observational indistinguishability for XPath (a language that extends modal logics like PDL with tests for data equality), with and without transitive closure operators. We show that in general the problem is PSPACE-complete, but identify several restrictions that yield better complexity bounds (coNP, PTIME) by controlling suitable parameters of the problem, namely the amount of non-locality allowed and the class of models considered (graphs, DAGs, trees). In particular, this analysis yields a hierarchy of tractable fragments.
Affiliation: Abriola, Sergio Alejandro. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
Affiliation: Barceló, Pablo. Universidad de Chile; Chile
Affiliation: Figueira, Diego. Centre National de la Recherche Scientifique; France
Affiliation: Figueira, Santiago. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
A logical framework to study concept-learning biases in the presence of multiple explanations
When people seek to understand concepts from an incomplete set of examples and counterexamples, there is usually an exponentially large number of classification rules that can correctly classify the observed data, depending on which features of the examples are used to construct these rules. A mechanistic approximation of human concept-learning should help to explain how humans prefer some rules over others when there are many that can be used to correctly classify the observed data. Here, we exploit the tools of propositional logic to develop an experimental framework that controls the minimal rules that are simultaneously consistent with the presented examples. For example, our framework allows us to present participants with concepts consistent with a disjunction and also with a conjunction, depending on which features are used to build the rule. Similarly, it allows us to present concepts that are simultaneously consistent with two or more rules of different complexity and using different features. Importantly, our framework fully controls which minimal rules compete to explain the examples and is able to recover the features used by the participant to build the classification rule, without relying on supplementary attention-tracking mechanisms (e.g. eye-tracking). We exploit our framework in an experiment with a sequence of such competitive trials, illustrating the emergence of various transfer effects that bias participants’ prior attention to specific sets of features during learning.
Affiliation: Abriola, Sergio Alejandro. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
Affiliation: Tano, Pablo. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina
Affiliation: Romano, Sergio Gaston. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
Affiliation: Figueira, Santiago. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
Data-graph repairs: the preferred approach
Repairing inconsistent knowledge bases is a task that has been addressed, with
great advances over several decades, from within the knowledge representation
and reasoning and the database theory communities. As information becomes more
complex and interconnected, new types of repositories, representation languages
and semantics are developed in order to be able to query and reason about it.
Graph databases provide an effective way to represent relationships among data,
and allow processing and querying these connections efficiently. In this work,
we focus on the problem of computing preferred (subset and superset) repairs
for graph databases with data values, using a notion of consistency based on a
set of Reg-GXPath expressions as integrity constraints. Specifically, we study
the problem of computing preferred repairs based on two different preference
criteria, one based on weights and the other based on multisets, showing that
in most cases it is possible to retain the same computational complexity as in
the case where no preference criterion is available for exploitation.
Comment: arXiv admin note: text overlap with arXiv:2206.0750
An epistemic approach to model uncertainty in data-graphs
Graph databases are becoming widely successful as data models that allow
complex relationships among various types of data to be represented and
processed effectively. As with any other type of data repository, graph databases may suffer
from errors and discrepancies with respect to the real-world data they intend
to represent. In this work we explore the notion of probabilistic unclean graph
databases, previously proposed for relational databases, in order to capture
the idea that the observed (unclean) graph database is actually the noisy
version of a clean one that correctly models the world but that we know
only partially. As the factors that may be involved in the observation can be many,
e.g., all different types of clerical errors or unintended transformations of
the data, we assume a probabilistic model that describes the distribution over
all possible ways in which the clean (uncertain) database could have been
polluted. Based on this model we define two computational problems, data
cleaning and probabilistic query answering, and study the complexity of both
when the transformation of the
database can be caused by either removing (subset) or adding (superset) nodes
and edges.
Comment: 25 pages, 3 figures
On the complexity of finding set repairs for data-graphs
In the deeply interconnected world we live in, pieces of information link
domains all around us. As graph databases effectively capture relationships
among data and allow processing and querying these connections efficiently,
they are rapidly becoming a popular platform for storage that supports a wide
range of domains and applications. As in the relational case, data is expected
to satisfy a set of integrity constraints that define the semantic
structure of the world it represents. When a database does not satisfy its
integrity constraints, a possible approach is to search for a 'similar'
database that does satisfy the constraints, also known as a repair. In this
work, we study the problem of computing subset and superset repairs for graph
databases with data values using a notion of consistency based on a set of
Reg-GXPath expressions as integrity constraints. We show that for positive
fragments of Reg-GXPath these problems admit a polynomial-time algorithm, while
the full expressive power of the language renders them intractable.
Comment: 35 pages, including Appendix
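As an illustration of what a subset repair is, the sketch below enumerates, by brute force, the inclusion-maximal consistent sub-graphs of a tiny data-graph. It rests on several assumptions that are not from the paper: only edges are deleted (nodes are kept), the integrity constraint is a hand-written Python predicate standing in for a Reg-GXPath expression, and the names `subset_repairs`, `same_data_targets` and `DATA` are hypothetical. The search is exponential in the number of edges and is not the polynomial-time algorithm mentioned in the abstract.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List, Tuple

Edge = Tuple[str, str]  # (source node, target node), single edge label assumed

# Data value attached to each node (illustrative values).
DATA = {"a": 1, "b": 1, "c": 2}


def same_data_targets(edges: FrozenSet[Edge]) -> bool:
    """Stand-in constraint: all targets of edges leaving the same node
    must carry the same data value."""
    return all(
        DATA[t1] == DATA[t2]
        for (s1, t1) in edges
        for (s2, t2) in edges
        if s1 == s2
    )


def subset_repairs(
    edges: Iterable[Edge],
    consistent: Callable[[FrozenSet[Edge]], bool],
) -> List[FrozenSet[Edge]]:
    """Return every inclusion-maximal consistent subset of `edges`."""
    all_edges = sorted(set(edges))
    repairs: List[FrozenSet[Edge]] = []
    # Visit candidates from largest to smallest, so any consistent strict
    # superset of a candidate is already covered by a recorded repair.
    for k in range(len(all_edges), -1, -1):
        for kept in combinations(all_edges, k):
            candidate = frozenset(kept)
            if consistent(candidate) and not any(candidate < r for r in repairs):
                repairs.append(candidate)
    return repairs


graph = [("a", "b"), ("a", "c"), ("b", "c")]
for repair in subset_repairs(graph, same_data_targets):
    print(sorted(repair))
```

Here node `a` points to targets with different data values, so each repair drops exactly one of the two offending edges and keeps the rest of the graph.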
Model theory of XPath on data trees. Part II: Binary bisimulation and definability
We study the expressive power of the downward and vertical fragments of XPath equipped with (in)equality tests over possibly infinite data trees. We introduce a suitable notion of saturation with respect to node expressions, and show that over saturated data trees, the already studied notion of (unary) bisimulation coincides with the idea of ‘indistinguishability by means of node expressions’. We also prove definability and separation theorems for classes of pointed data trees. We introduce new notions of binary bisimulations, which relate two pairs of nodes of data trees. We show that over finitely branching data trees, these notions correspond to the idea of ‘indistinguishability by means of path expressions’. We prove a characterization theorem, which describes when a first-order formula with two free variables is expressible in the downward fragment. We show definability and separation theorems for classes of two-pointed data trees in the context of path expressions.
Affiliation: Abriola, Sergio Alejandro. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación; Argentina
Affiliation: Descotte, María Emilia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigaciones Matemáticas "Luis A. Santaló". Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigaciones Matemáticas "Luis A. Santaló"; Argentina
Affiliation: Figueira, Santiago. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación; Argentina
Linearizing well quasi-orders and bounding the length of bad sequences
We study the length functions of controlled bad sequences over some well quasi-orders (wqo's) and classify them in the Fast Growing Hierarchy. We develop a new and self-contained study of the length of bad sequences over the disjoint product in N^n (Dickson's Lemma), which leads to recently discovered upper bounds but through a simpler argument. We also give a tight upper bound for the length of controlled decreasing sequences of multisets of N^n with the underlying lexicographic ordering, and use it to give an upper bound for the length of controlled bad sequences in the majoring ordering with the underlying disjoint product ordering. We apply this last result to attain complexity upper bounds for the emptiness problem of ITCA and ATRA automata. For the case of the product and majoring wqo's the idea is to linearize bad sequences, i.e. to transform a bad sequence over a wqo into a decreasing one over a well-order, for which upper bounds can be more easily handled.
Affiliation: Abriola, Sergio Alejandro. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Figueira, Santiago. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Senno, Gabriel Ignacio. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
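The notions of bad and controlled sequence used in this abstract can be checked mechanically on small examples. The sketch below assumes the product (componentwise) ordering on tuples of naturals, and one common convention for control: the largest component of the i-th element must be strictly below the control function `g` iterated i times on the initial value `t`. The paper's exact definition may differ, and the function names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Vec = Tuple[int, ...]


def leq(x: Vec, y: Vec) -> bool:
    """Product ordering on N^n: x <= y iff it holds in every component."""
    return len(x) == len(y) and all(a <= b for a, b in zip(x, y))


def is_bad(seq: Sequence[Vec]) -> bool:
    """A sequence is bad if no earlier element is <= a later one."""
    return not any(
        leq(seq[i], seq[j])
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
    )


def is_controlled(seq: Sequence[Vec], g: Callable[[int], int], t: int) -> bool:
    """Assumed convention: max component of seq[i] < g applied i times to t."""
    bound = t
    for x in seq:
        if max(x) >= bound:
            return False
        bound = g(bound)
    return True


bad = [(2, 2), (1, 5), (0, 7)]
good = [(1, 1), (2, 0), (2, 3)]  # (1, 1) <= (2, 3), so this sequence is not bad

print(is_bad(bad))                               # True
print(is_bad(good))                              # False
print(is_controlled(bad, lambda n: 2 * n, 3))    # True: bounds are 3, 6, 12
print(is_controlled(bad, lambda n: n + 1, 3))    # False: 5 is not below 4
```

Dickson's Lemma guarantees that every bad sequence over N^n is finite; the control function is what makes a bound on its maximal length meaningful.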
Definability for Downward and Vertical XPath on Data Trees
We study the expressive power of the downward and vertical fragments of XPath equipped with (in)equality tests over data trees. We give necessary and sufficient conditions for a class of pointed data trees to be definable by a set of formulas or by a single formula of each of the studied logics. To do so, we introduce a notion of saturation, and show that over saturated data trees bisimulation coincides with logical equivalence.
Affiliation: Abriola, Sergio Alejandro. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Descotte, María Emilia. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Matemática; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Figueira, Santiago. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina